The relationship between classification of multi-domain proteins using an alignment-free approach and their functions: a case study with immunoglobulins.
Abstract
Establishing functional relationships between multi-domain protein sequences is a non-trivial task. Traditionally, delineating the functional assignment and relationships of proteins requires domain assignments as a prerequisite. This process is sensitive to alignment quality and domain definitions, and in multi-domain proteins the quality of alignments is often poor for multiple reasons. We report the correspondence between the classification of proteins represented as full-length gene products and their functions. Our approach differs fundamentally from traditional methods in that it does not perform the classification at the level of domains. The method is based on an alignment-free computation of local matching scores (LMS) at the amino-acid sequence level, followed by hierarchical clustering. As there are no gold standards for full-length protein sequence classification, we resorted to Gene Ontology and domain-architecture-based similarity measures to assess our classification. The final clusters obtained using LMS show high functional and domain-architectural similarity. Comparison of the current method with alignment-based approaches, at both the domain and the full-length protein level, showed the superiority of the LMS scores. Using this method, we have recreated objective relationships among different protein kinase sub-families and have also classified immunoglobulin-containing proteins, for which sub-family definitions do not currently exist. The method can be applied to any set of protein sequences and will therefore be instrumental in the analysis of large numbers of full-length protein sequences.
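The two-step pipeline described in the abstract (alignment-free pairwise scoring of full-length sequences, then hierarchical clustering) can be illustrated with a minimal sketch. The abstract does not define how LMS is computed, so the score below is a stand-in and not the authors' method: it uses the Jaccard distance between k-mer sets as a simple alignment-free measure. The function names, the choice of k = 3, the average-linkage rule, the distance threshold, and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_distance(a, b, k=3):
    """Jaccard distance between k-mer sets.

    Assumption: a simple alignment-free stand-in, NOT the paper's LMS.
    """
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


def average_linkage(seqs, threshold, k=3):
    """Agglomerative clustering with average linkage.

    Repeatedly merges the two closest clusters and stops once the
    closest pair is farther apart than `threshold`.
    """
    names = list(seqs)
    dist = {
        frozenset((x, y)): kmer_distance(seqs[x], seqs[y], k)
        for x, y in combinations(names, 2)
    }
    clusters = [[n] for n in names]

    def cluster_dist(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (i, j), d = min(
            (
                ((i, j), cluster_dist(clusters[i], clusters[j]))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]


# Hypothetical toy sequences (not real proteins): two similar pairs.
toy = {
    "a1": "MKVLAAGIVGLLLA",
    "a2": "MKVLAAGIVGLLLS",
    "b1": "QWERTYHNDPSCQW",
    "b2": "QWERTYHNDPSCQF",
}
clusters = average_linkage(toy, threshold=0.5)
print(clusters)
```

No domain boundaries and no alignment are needed at any step, which is the property the abstract emphasizes. In practice the k-mer distance would be replaced by the actual LMS computation, and the resulting clusters would be assessed against Gene Ontology and domain-architecture similarity as the abstract describes.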
Similar resources
Image Classification via Sparse Representation and Subspace Alignment
Image representation is a crucial problem in image processing, where many low-level representations of images exist, e.g., SIFT, HOG and so on. But there is a missing link between low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principal component analysis are employed to d...
Arabidopsis leaf plasma membrane proteome using a gel free method: Focus on receptor–like kinases
The hydrophobic proteins of the plant plasma membrane still remain largely unknown. For example, in the Arabidopsis genome, receptor-like kinases (RLKs), plasma membrane proteins that function as the primary receptors in the signaling of stress conditions, hormones and the presence of pathogens, form a diverse family of over 610 genes. A limited number of these proteins have appeared in pr...
Discovering Domains Mediating Protein Interactions
Background: Protein-protein interactions do not provide any direct information regarding the domains within the proteins that mediate the interactions. The majority of proteins are multi-domain proteins, and the interaction between them is often defined by pairs of their domains. Most former studies focus only on interacting domain pairs. However, they do not consider the in...
In Silico Characterization of Proteins Containing ARID-PHD Domain and Its Expression in Aeluropus littoralis Halophyte
Abiotic stresses are the most important factors that reduce the yield of crops. In this context, bioinformatics analysis plays an important role in studying genes and their relatedness, as well as in predicting their function in response to abiotic stresses. Among all domains, the ARID-PHD domain has been identified in plants and animals and has a very significant role in growth regulation, cell cycle, and ...
The explanation of the relationship between the physical factors and the quality of the concept secondary domain with the level of social security of residents in residential settings in crime prevention based on the approach CPTED (Case study: Resi
Abstract Human beings have needs; security is one of the most important human needs and plays an important role in the happiness and health of individuals and society, and housing is one of the basic human needs after food and clothing. But this need has become a serious problem due to the increasing growth of demand for housing as a result of the increasing population and urbanization of cont...
Journal title:
- Molecular BioSystems
Volume 10, Issue 5
Pages -
Publication date: 2014